Self-Driving Car Nano Degree Project 3

Network Design

I researched some of the published models for this problem. The approaches explained in the following publications seemed promising and simple enough to be implemented for this project.

  1. Princeton DeepDriving (http://deepdriving.cs.princeton.edu/): As explained in the paper (https://arxiv.org/pdf/1505.00256.pdf), their model is based on AlexNet, with 5 convolution layers followed by 4 fully connected layers with output dimensions of 4096, 4096, 256, and 13. The network is trained on 480K images at 280x210 resolution for 140K iterations.

    Pros: This is a very large network and can learn to drive on difficult courses.
    Cons: It is very expensive to train. When I tried training it, a single epoch took about 10 minutes.

  2. Nvidia SDC (https://arxiv.org/pdf/1604.07316v1.pdf): This too is loosely based on the AlexNet architecture, with 5 conv layers followed by 4 fully connected layers.

    Pros: Simple and robust architecture.
    Cons: Though smaller than the Princeton network, it is still quite expensive to train.

  3. CommaAI (https://github.com/commaai/research): This model is much smaller than the other two, with 3 conv layers followed by 2 fully connected layers. It took only 20 seconds to train 1 epoch on my machine.

    Pros: Easy to train, still very effective.
    Cons: Maybe not good enough for difficult courses.

I decided to use the CommaAI model for this project because it was very fast to train and the initial results were quite promising. This let me run a large number of experiments in a reasonable amount of time.

Model Details

| Layer (type)                    | Output Shape        | Param #   | Connected to             |
____________________________________________________________________________________________________

| lambda_1 (Lambda)               | (None, 80, 280, 3)  | 0         |  lambda_input_1[0][0]    |
____________________________________________________________________________________________________
| convolution2d_1 (Convolution2D) | (None, 20, 70, 16)  |  3088     |   lambda_1[0][0]         |
____________________________________________________________________________________________________
| elu_1 (ELU)                     | (None, 20, 70, 16)  |  0        |   convolution2d_1[0][0]  |
____________________________________________________________________________________________________
| convolution2d_2 (Convolution2D) | (None, 10, 35, 32)  |  12832    |   elu_1[0][0]            |
____________________________________________________________________________________________________
| elu_2 (ELU)                     | (None, 10, 35, 32)  |  0        |   convolution2d_2[0][0]  |
____________________________________________________________________________________________________
| convolution2d_3 (Convolution2D) | (None, 5, 18, 64)   |  51264    |   elu_2[0][0]            |
____________________________________________________________________________________________________
| flatten_1 (Flatten)             | (None, 5760)        |  0        |   convolution2d_3[0][0]  |
____________________________________________________________________________________________________
| dropout_1 (Dropout)             | (None, 5760)        |  0        |   flatten_1[0][0]        |
____________________________________________________________________________________________________
| elu_3 (ELU)                     | (None, 5760)        |  0        |   dropout_1[0][0]        |
____________________________________________________________________________________________________
| dense_1 (Dense)                 | (None, 512)         |  2949632  |   elu_3[0][0]            |
____________________________________________________________________________________________________
| dropout_2 (Dropout)             | (None, 512)         |  0        |   dense_1[0][0]          |
____________________________________________________________________________________________________
| elu_4 (ELU)                     | (None, 512)         |  0        |   dropout_2[0][0]        |
____________________________________________________________________________________________________
| dense_2 (Dense)                 | (None, 1)           |  513      |   elu_4[0][0]            |

Total params: 3,017,329
Trainable params: 3,017,329
Non-trainable params: 0
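
The shapes and parameter counts in the table can be cross-checked with a little arithmetic. The sketch below is not the training code. It assumes kernel sizes of 8x8, 5x5, and 5x5 with strides of 4, 2, and 2 and "same" padding, values inferred from the output shapes and parameter counts above rather than stated in this write-up. The helper name `conv_same` is hypothetical.

```python
import math


def conv_same(h, w, c_in, filters, kernel, stride):
    """Output shape and parameter count of a 'same'-padded 2D convolution."""
    out_h = math.ceil(h / stride)
    out_w = math.ceil(w / stride)
    # One kernel x kernel x c_in weight block per filter, plus one bias each.
    params = kernel * kernel * c_in * filters + filters
    return (out_h, out_w, filters), params


# ASSUMPTION: (filters, kernel, stride) per conv layer, inferred from the table.
conv_layers = [(16, 8, 4), (32, 5, 2), (64, 5, 2)]

shape = (80, 280, 3)  # cropped input image, as listed for lambda_1
total_params = 0
for filters, kernel, stride in conv_layers:
    shape, params = conv_same(*shape, filters, kernel, stride)
    total_params += params
    print("conv", shape, params)

flat = shape[0] * shape[1] * shape[2]  # size after Flatten
dense_1 = flat * 512 + 512             # weights + biases of dense_1
dense_2 = 512 * 1 + 1                  # weights + bias of dense_2
total_params += dense_1 + dense_2

print("flatten", flat)
print("total", total_params)
```

Under these assumptions the computed total matches the 3,017,329 parameters reported in the summary, and almost all of them sit in dense_1.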

Training details:

  1. Image size: I started training with full 160x320 resolution images, but because of the extensive background detail the results were not very good. To cut out extraneous information and focus solely on the road, I cropped the images to 80x280.

  2. Training data generation: My initial model was trained mostly on straight-line driving, and it was able to drive the car only as far as the sharp left turn after the bridge. I realized that the course had a few peculiar turns after the bridge which were not present anywhere else on the track. I oversampled those turns in the training data by doing the following:

    1. Zigzag driving: Drive the car all the way to the edge and bring it back to the center.
    2. Positive steering angles: There is only one turn on the track which requires a large positive angle. I drove a few rounds and recorded only that turn to gather more positive-angle data.
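
The cropping step described under image size can be sketched as follows. This is only an illustration: the write-up gives the input size (160x320) and the output size (80x280) but not which rows and columns were kept, so the `top` and `left` offsets below are hypothetical, as is the helper name `crop_window`.

```python
def crop_window(in_h, in_w, out_h, out_w, top, left):
    """Return (row_slice, col_slice) selecting an out_h x out_w window."""
    if top < 0 or left < 0 or top + out_h > in_h or left + out_w > in_w:
        raise ValueError("crop window falls outside the image")
    return slice(top, top + out_h), slice(left, left + out_w)


# ASSUMPTION: offsets chosen for illustration; the actual values are not
# given in this write-up.
rows, cols = crop_window(160, 320, 80, 280, top=60, left=20)

# Stand-in image built from nested lists so the sketch has no dependencies.
image = [[(0, 0, 0)] * 320 for _ in range(160)]
cropped = [row[cols] for row in image[rows]]

print(len(cropped), len(cropped[0]))
```

With a NumPy image array the same slices can be applied directly as `image[rows, cols]`.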

With the training data described above, I trained the model with the following hyperparameters:

Epochs: 400
Batch size: 64
Learning rate: 0.0001
Optimizer: Adam
In [2]:
from IPython.display import Image, display
import matplotlib.pyplot as plt

# Here are some examples of how I collected training data for difficult turns
# on the course.

# Training for navigating sharp left turn

display(Image(filename='./files/left1.png'))
display(Image(filename='./files/left2.png'))
In [4]:
# Training for recovering from under steering on a sharp left turn

display(Image(filename='./files/left3.png'))
display(Image(filename='./files/left4.png'))
In [3]:
# Training for navigating sharp right turn

display(Image(filename='./files/right1.png'))
display(Image(filename='./files/right2.png'))
In [5]:
# Training for recovery from going off road

display(Image(filename='./files/off1.png'))
display(Image(filename='./files/off2.png'))
In [9]:
import tables

# Load the training labels (steering angles); the context manager closes
# the HDF5 file once the data has been read.

with tables.open_file('./drive_data.h5', mode='r') as h5file:
    y_train = h5file.get_node("/train_labels").read()

plt.hist(y_train)
plt.xlabel('Steering Angle')
plt.ylabel('Number of Samples')
plt.show()

# The histogram below shows the distribution of steering angles in the training
# data. The right and left turns (positive angles) are oversampled and straight
# driving examples are undersampled.
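
The balance described above can also be checked numerically rather than read off the histogram. The sketch below is a minimal example, assuming a flat sequence of steering angles. The helper `straight_fraction`, the 0.05 tolerance, and the toy angles are all hypothetical and are not taken from drive_data.h5.

```python
def straight_fraction(angles, tolerance=0.05):
    """Fraction of samples whose |steering angle| is within `tolerance` of zero."""
    angles = list(angles)
    if not angles:
        raise ValueError("no steering angles given")
    near_zero = sum(1 for a in angles if abs(a) <= tolerance)
    return near_zero / len(angles)


# Made-up angles for illustration only; not taken from drive_data.h5.
toy_angles = [0.0, 0.02, -0.3, 0.45, 0.5, -0.25, 0.0, 0.6]
print(straight_fraction(toy_angles))
```

A low value indicates that straight driving is undersampled relative to turns. On the real data the same function could be called on a flattened copy of `y_train`.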